Algorithmic Redistribution Methods for Block-Cyclic Decompositions
Authors
Abstract
This research aims to create and provide a framework for describing algorithmic redistribution methods for various block cyclic decompositions. To do so, properties of this data distribution scheme are formally exhibited. The examination of a number of basic dense linear algebra operations illustrates the application of those properties. This study analyzes the extent to which the general two-dimensional block cyclic data distribution allows for the expression of efficient as well as flexible matrix operations. This study also quantifies, theoretically and practically, how much of the efficiency of optimal block cyclic data layouts can be maintained. First, the general block cyclic decomposition scheme is shown to allow for the expression of flexible basic matrix operations with little impact on the performance and efficiency delivered by the optimal and restricted kernels available today. Second, block cyclic data layouts, such as the purely scattered distribution, which seem less promising as far as performance is concerned, are shown to be able to achieve optimal performance and efficiency for a given set of matrix operations. Consequently, this research not only demonstrates that the restrictions imposed by the optimal block cyclic data layouts can be alleviated, but also that efficiency and flexibility are not antagonistic features of the block cyclic mappings. These results are particularly relevant to the design of dense linear algebra software libraries as well as to data parallel compiler technology.
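For readers unfamiliar with the block cyclic mapping the abstract refers to, the following is a minimal sketch of the usual index arithmetic, not code taken from the paper. It assumes zero-based indices and that the first block is stored on process 0; the function names and parameters (`owner_and_local`, `to_global`, `owner_2d`, `nb`, `mb`, `nprocs`, `prow`, `pcol`) are hypothetical and chosen only for illustration.

```python
def owner_and_local(g, nb, nprocs):
    """Map global index g to (owning process, local index).

    Assumes a 1-D block cyclic layout with block size nb over nprocs
    processes, zero-based indices, and the first block on process 0.
    """
    block = g // nb                      # global block that contains g
    owner = block % nprocs               # blocks are dealt out round-robin
    local = (block // nprocs) * nb + g % nb
    return owner, local


def to_global(p, l, nb, nprocs):
    """Inverse mapping: local index l on process p back to the global index."""
    return ((l // nb) * nprocs + p) * nb + l % nb


def owner_2d(i, j, mb, nb, prow, pcol):
    """Two-dimensional case: apply the 1-D rule independently to rows and columns.

    Entry (i, j) of the matrix lands on grid coordinate (pr, pc) of a
    prow x pcol process grid, with row block size mb and column block size nb.
    """
    pr, li = owner_and_local(i, mb, prow)
    pc, lj = owner_and_local(j, nb, pcol)
    return (pr, pc), (li, lj)


if __name__ == "__main__":
    # Block size 2 over 3 processes: global index 5 sits in block 2.
    print(owner_and_local(5, 2, 3))
    # Block size 1 gives a purely cyclic (scattered) layout.
    print([owner_and_local(g, 1, 3)[0] for g in range(6)])
```

Under these assumptions, a block size of 1 yields the purely scattered (cyclic) layout mentioned in the abstract, while a block size of at least the array length divided by the process count yields a plain block layout. Redistribution between two layouts then amounts to composing `to_global` for the source layout with `owner_and_local` for the target layout.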
Related articles
Efficient Methods for kr→r and r→kr Array Redistribution
Array redistribution is usually required to enhance algorithm performance in many parallel programs on distributed memory multicomputers. Since it is performed at run-time, there is a performance tradeoff between the efficiency of new data decomposition for a subsequent phase of an algorithm and the cost of redistributing data among processors. In this paper, we present efficient algorithms for...
Irregular Redistribution Scheduling by Partitioning Messages
Dynamic data redistribution enhances data locality and improves algorithm performance for numerous scientific problems on distributed memory multi-computers systems. Regular data distribution typically employs BLOCK, CYCLIC, or BLOCK-CYCLIC(c) to specify array decomposition. Conversely, an irregular distribution specifies an uneven array distribution based on user-defined functions. Performing ...
Addendum to: "Infinite-dimensional versions of the primary, cyclic and Jordan decompositions", by M. Radjabalipour
In his paper mentioned in the title, which appears in the same issue of this journal, Mehdi Radjabalipour derives the cyclic decomposition of an algebraic linear transformation. A more general structure theory for linear transformations appears in Irving Kaplansky's lovely 1954 book on infinite abelian groups. We present a translation of Kaplansky's results for abelian groups into the terminolo...
Infinite-dimensional versions of the primary, cyclic and Jordan decompositions
The famous primary and cyclic decomposition theorems along with the tightly related rational and Jordan canonical forms are extended to linear spaces of infinite dimensions with counterexamples showing the scope of extensions.
Multi-phase array redistribution: modeling and evaluation
[Table 1: Execution times (ms) for cyclic(s) to cyclic(t) redistribution on 32 processors; columns: s, t, lcm, lcm*2, lcm*4, gcd, gcd/2, gcd/4.] ... other block sizes t. Fig. 3 shows the total times in milliseconds for a cyclic(192) to cyclic(8) redistribution on 32 processors for increasing data sizes. This redistribution corresponds to the cyclic(Y t) to cyclic(t) case with Y = 2...
Journal: IEEE Trans. Parallel Distrib. Syst.
Volume: 10
Publication date: 1999